ABSTRACT

A number of evaluation techniques for college reading programs are
described and discussed. The techniques indicated include (1) determining
a clear definition of objectives and specific criterion tasks that are
consistent with program objectives, (2) using standardized tests for
describing group change, (3) analyzing academic achievement as shown in
course grades and grade-point averages, and (4) assessing students' needs
and attitudes. It was pointed out that a variety of techniques should be
used to diagnose each individual student's needs and to evaluate his
progress. References are included. (PE)
The evaluation of a college reading program cannot be separated from the
goals, objectives and practices of the program. Evaluation, of necessity, must
differ from research studies. Rarely can the typical research paradigms be
[...] for special help because of their deficiencies or seek help voluntarily.
To set up randomly selected groups including non-treatment controls or even to
[...] program is to offer service to students, not to experiment. Heterogeneous
grouping is usually either impossible or undesirable.

Evaluation seeks to answer questions like "How effective is our service in
meeting our objectives?" and "In what ways are students improving their
skills?" as well as negative questions such as "What students fail to benefit
from our program?" and "Are some students harmed through their experiences in
the program?"

Evaluation is essential in making decisions as to how the service might
be improved, in planning and selecting materials and instructors, in determining
whether a service should be expanded or contracted, and in justifying your
existence to budget committees.

The first step in evaluation is to determine your goals and specific
objectives based, of course, on assessment of student and faculty expressed
needs. If objectives can be clearly stated in behavioral terms, then the job
of evaluating the program is easier.

Data need to be collected and accurate records kept. Evaluation must be
a continuous process beginning when the program is in the planning stages. In
order that accurate and adequate data be collected, decisions as to what
information is to be kept must be made at the start of the program, not after
it is finished. Some kinds of data may be essential for annual reports of the
service and budget committees. The usual demographic information collected
about students who participate in the program includes number of students by
sex, year in college, curriculum, prior grade point average (if any), use made
of service, type of problem, scholastic aptitude test scores, etc. Knowledge
of these background factors is useful for assessment and planning.

Limitations of Standardized Tests in Evaluation

Although standardized tests are used in most programs for screening
students in reading and study skills work, they have limited usefulness in
evaluating the program. First of all, most standardized tests do not measure
the specific goals of most programs and are only tangentially related to the
activities that students perform in the program. Traditionally, college reading
tests measure reading rate (often on a very limited scale, such as the
one-minute timed sample on the Nelson-Denny, a most unrealistic time-sample of
the reading of the typical college student), vocabulary and comprehension
(usually based on understanding paragraphs). If the objective of a reading and
study skills course is to teach students effective techniques for getting the
main ideas and significant details from a textbook chapter, scores on a
standardized post-test are not likely to reveal much about their competence in
mastering this skill. Nor does the typical test show whether a student is
reading flexibly for different purposes, can skim and scan, or can retain
the major concepts from studying a chapter for several days until he has time
to review it again.

Standardized tests in reading are useful as predictors of college success,
as screening instruments for picking students with exceptionally high or low
scores, and for describing groups in general. What the standardized test does
not do is reveal specific information about a given student's difficulties, nor
does it necessarily reflect the progress he has made in pursuing a reading
course.

Also standardized tests, as they are often used, mask individual
differences, particularly when means are used to describe the results of a
course.
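The masking effect of a mean can be shown with a small sketch. The scores
below are invented for illustration and do not come from any program: two
hypothetical classes show the same mean gain, yet in one class every student
gains a little, while in the other half the students gain a great deal and
half lose ground.

```python
from statistics import mean


def gain_summary(pre, post):
    """Return the mean gain and counts of students who improved, held steady, or declined."""
    gains = [after - before for before, after in zip(pre, post)]
    return {
        "mean_gain": mean(gains),
        "improved": sum(g > 0 for g in gains),
        "unchanged": sum(g == 0 for g in gains),
        "declined": sum(g < 0 for g in gains),
    }


# Hypothetical percentile scores for two classes of four students each.
class_a = gain_summary(pre=[30, 40, 50, 60], post=[35, 45, 55, 65])
class_b = gain_summary(pre=[30, 40, 50, 60], post=[50, 60, 40, 50])
```

Reporting only `mean_gain` would make the two classes look identical; the
counts show that two students in the second class ended lower than they began.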

They do not reveal whether a student whose scores have remained unchanged
despite one's best efforts has actually been harmed by taking our program.
For example, we all know students who have inappropriately taken a commercial
speed reading course but, because they lacked basic vocabulary and
comprehension skills and failed to increase their speed, were left feeling even
more inadequate about their reading skills.

Another problem in using standardized tests to measure changes in reading
as a result of a program concerns working with deficient readers. For example,
if the students are low in reading skills and sectioned into a reading workshop,
one would expect that their scores would improve through chance alone
(regression toward the mean). Even studies which have attempted to use control
groups have their limitations. Usually in this case, a group of students who
are not given the reading program is matched on the basis of sex, college year,
curriculum and reading ability with those who take advantage of the reading
course. If the reading program is a voluntary one, students who enter it may
be more highly motivated than those who have not sought help even though they
need it equally as much.
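Regression toward the mean can be demonstrated with a short simulation. This
is only a sketch under stated assumptions: each observed score is taken to be
a stable "true" skill plus independent measurement error, the distributions
and sample size are arbitrary, and no instruction of any kind occurs between
the two testings.

```python
import random
from statistics import mean


def simulate_retest(n_students=2000, cutoff_fraction=0.2, seed=1):
    """Simulate a pretest and a retest with NO instruction in between.

    Students in the bottom fraction of the pretest are 'sectioned' into a
    workshop that, in this simulation, teaches nothing at all.
    Returns (pretest mean, retest mean) for that low-scoring group.
    """
    rng = random.Random(seed)
    true_skill = [rng.gauss(50, 10) for _ in range(n_students)]
    pre = [t + rng.gauss(0, 5) for t in true_skill]
    post = [t + rng.gauss(0, 5) for t in true_skill]

    # Select the lowest scorers using the pretest only.
    order = sorted(range(n_students), key=lambda i: pre[i])
    low = order[: int(n_students * cutoff_fraction)]

    return mean([pre[i] for i in low]), mean([post[i] for i in low])


low_pre_mean, low_post_mean = simulate_retest()
apparent_gain = low_post_mean - low_pre_mean
```

The selected group scores higher on the retest even though nothing was
taught, because students chosen for unusually low pretest scores were, on
average, also unlucky on that testing.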

Most tests are confounded by the rate factor (Stroud, 1955). The
Nelson-Denny is a good example, since the slow reader is typically unable to
complete many of the vocabulary or comprehension items; therefore his scores
are low. Showing him how to increase his speed in taking tests may result in
improved scores.

Another problem cited by Davis (1961) is that of guessing. This can
spuriously raise test scores, particularly when tests are administered to
low-achieving students who may randomly mark answers even though they have not
read the items.
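The size of the guessing effect is simple arithmetic. The sketch below uses
the classical correction-for-guessing formula from general psychometric
practice, which is an assumption here and is not attributed to Davis; the
item and answer-choice counts are hypothetical.

```python
def expected_chance_score(n_items, n_choices):
    """Expected number right if every item is marked at random."""
    return n_items / n_choices


def corrected_score(n_right, n_wrong, n_choices):
    """Classical correction for guessing: rights minus wrongs / (choices - 1).

    Omitted items are counted neither as right nor as wrong.
    """
    return n_right - n_wrong / (n_choices - 1)


# Hypothetical 100-item test with five answer choices per item.
chance = expected_chance_score(100, 5)
# A student who marks every item blindly and gets the expected 20 right.
blind_marker = corrected_score(n_right=20, n_wrong=80, n_choices=5)
```

A student who marks all items at random can expect a raw score of 20 on this
hypothetical test, which the correction reduces to zero.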

Alternate forms of standardized tests may not be exactly parallel. For
example, if raw scores are used in the computation of change on the
Nelson-Denny and Form A is administered first, we find that a freshman student
reading at 250 words per minute would score at the [illegible] percentile on
the manual norms, whereas a student scoring at 250 on Form B would be at the
60th percentile.

At the upper levels, students scoring at 376 words per minute on Form A
would be at the 90th percentile, whereas a student scoring at only 3[?]6 words
per minute would be at the 90th percentile on Form B. Obviously there are
differences throughout the test norms that would affect the results of a
pre-post-test comparison. This problem can be handled by converting the raw
scores into standard scores so that the student's rank within the group becomes
the measure. However, since this involves some statistical operations and
reading people are typically averse to computing, this is rarely done. As a
result, many of the conclusions reported in the literature that are based on
raw score data are spurious. Tracy and Rankin (196?) describe a residual gain
statistic based on either of two computational methods: one derived from a
Z-score formula for equating pre- and post-test results and another formula
using raw scores. These are attempts to statistically equate the pre- and
post-test scores of an individual. The authors stress that it is necessary to
compute and graph each individual reading class or group.
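A minimal sketch of the standard-score conversion and of one common form of
residual gain follows. It assumes the usual definition, in which the residual
gain is the post-test z-score minus the z-score predicted from the pretest
(the correlation times the pretest z-score); whether this matches either of
the Tracy and Rankin computations in detail is an assumption, and the scores
are invented.

```python
from statistics import mean, pstdev


def z_scores(scores):
    """Convert raw scores to standard scores within the group."""
    m, s = mean(scores), pstdev(scores)
    if s == 0:
        raise ValueError("all scores are identical; z-scores are undefined")
    return [(x - m) / s for x in scores]


def residual_gains(pre, post):
    """Post-test z-score minus the post-test z-score predicted from the pretest."""
    z_pre, z_post = z_scores(pre), z_scores(post)
    # Pearson correlation, computed as the mean product of z-scores.
    r = mean([a * b for a, b in zip(z_pre, z_post)])
    return [b - r * a for a, b in zip(z_pre, z_post)]


# Hypothetical words-per-minute scores for one class of five students.
pre = [180, 220, 250, 300, 376]
post = [230, 240, 330, 310, 420]
gains = residual_gains(pre, post)
```

Computed this way, the gains within a class average zero and are uncorrelated
with pretest standing, so a positive value marks a student who improved more
than his starting point would predict.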

Another weakness in using standardized tests is the fact that unless one
has developed local norms on his own institution, using manual norms may be
deceptive. For example, at the University of Maryland we found that entering
freshmen at our school averaged scores comparable to the college senior norms
listed in the manual. Although there is some value in comparing your group
with national groups for prestige and status, it is more important to know
how an individual student ranks or a class ranks in regard to the specific
institution in which they are enrolled. In other words, it is important to
know where the student stands in relation to the competition in his own college
on reading skills. In summary, standardized tests have limited usefulness in
the assessment of a specific college reading program for the following reasons:

1) they rarely measure the objectives of the program that is being taught,

2) alternate forms may give spurious results unless some Z-transformation or
standard score is computed to equate the two,

3) if the reading program involves students who are weak in reading skills,
then regression toward the mean effects will undoubtedly occur and mask any
real changes,

4) since standardized tests are by definition both reliable and valid, they
are not subject to change readily as a result of a brief instructional program,

5) the use of mean score gains masks the variability that occurs in growth in
the typical reading class.
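The point about local norms made above can be illustrated with a short
sketch. The two norm groups are invented, and the percentile-rank definition
used (ties counted as half) is one common convention, taken here as an
assumption.

```python
def percentile_rank(score, norm_group):
    """Percent of the norm group scoring below `score`, counting ties as half."""
    below = sum(s < score for s in norm_group)
    tied = sum(s == score for s in norm_group)
    return 100 * (below + 0.5 * tied) / len(norm_group)


# Hypothetical reading-rate scores in words per minute.
manual_norm_sample = [180, 200, 220, 240, 260, 280, 300, 320]
local_freshmen = [260, 280, 300, 320, 340, 360, 380, 400]

student = 300
vs_manual = percentile_rank(student, manual_norm_sample)
vs_local = percentile_rank(student, local_freshmen)
```

The same raw score that looks strong against the hypothetical manual sample
falls well below the middle of the local group, which is the comparison that
matters for the student's own college.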

Academic Achievement as a Criterion 

Since effective reading and study skills are related to college success, it
has been generally accepted as a foregone conclusion that if you provide a
program that offers students who are deficient in these skills the opportunity
to learn more effective techniques, their college grades will improve. However,
in recent years it has been the rare reading program that systematically
assesses and reports grade point average improvement. For example, in a recent
survey of 17 compensatory education programs in the California community
colleges, only one program described the academic success of students in the
program in comparison with a control group who had not had special reading and
study skills help (Berg and Axtell, 1968).

Ten years ago reading researchers were much more adamant about the necessity
of using grades as a criterion. Entwisle and Entwisle (1960) stated that
improvement in overall scholastic average as compared with a control group was
the only adequate criterion measure of improvement following a college study
skills course. In summarizing [illegible] studies, they concluded that the
modal gain in overall G.P.A. was between .4 and .9 of a grade point and further
noted that improvement is "almost always maintained when follow-up studies are
done."

Wright (1962), in reviewing 31 studies which purported to measure the
relationship between reading training and college success, found only 11 with
comparable control groups, and only 7 of these reported significant improvement
in grades for the students taking reading improvement programs. He concluded
that the differential results could be attributed to other variables such as
the curriculum studied by the student, personality differences between
students, nature of the training program, length of course and competence of
the instruction.

Wright further describes his study in which students were randomly assigned
to control or experimental groups and both groups were retested on reading
ability at the end of the academic year. Two grade point averages were computed
for each student in the study: one based on grades in English, social studies
and humanities courses (Verbal G.P.A.) and one on science and mathematics
courses (Quantitative G.P.A.). The experimental subjects who completed the
reading course not only showed significantly higher scores on all the
Nelson-Denny sub-tests but also had significantly higher Verbal G.P.A.'s at the
end of the year than the controls did. However, there were no differences
between those taking the program and controls on Quantitative G.P.A.

Although the majority of studies reporting effects of reading and study
skills programs on improvement in grades show favorable results, there remains
the question of the representativeness of the reported studies, since editors
undoubtedly view studies with positive results as more desirable for
publication than those with negative results. Furthermore, as Wright (1962) has
demonstrated, the use of overall G.P.A. may not be realistic in assessing
reading and study skills programs, since the majority of college programs
stress English and social studies reading and place minimal emphasis on skills
in science and mathematics.

A recent study by King, Dellande and Walter (196?) reports that students
attending the University of Missouri Reading Improvement Program over a
six-year period did not show significantly higher post-grade-point-averages
than a control group. However, through analyzing their data by grouping
students according to initial reading rates, they found that only the middle
group (those whose initial reading speeds were between 200 and 250 words per
minute) showed significant improvement in grade point averages. (Students
reading slower than 200 words per minute or above 250 words per minute
initially did not show grade improvement.) They concluded that the students
reading in the 200 to 250 words per minute group initially were at a level
where increased reading rate would make a significant difference in their
studying, while those reading more slowly initially probably had attitudinal
problems, were perfectionistic or compulsive readers and hence harder to
change. Those reading above 250 words per minute initially, they feel, were
probably already reading well enough to keep pace with their college
assignments.
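The kind of breakdown described above can be sketched as follows. The records
are invented for illustration, and the band limits of 200 and 250 words per
minute are an assumption where the printed figures are hard to read; this is
not the authors' actual analysis, which would also require significance tests
against a control group.

```python
from collections import defaultdict
from statistics import mean


def rate_band(words_per_minute, low=200, high=250):
    """Assign a student to a band by initial reading rate (assumed cut-offs)."""
    if words_per_minute < low:
        return "slow"
    if words_per_minute <= high:
        return "middle"
    return "fast"


def mean_gpa_change_by_band(students):
    """students: iterable of (initial_wpm, gpa_before, gpa_after) tuples."""
    changes = defaultdict(list)
    for wpm, before, after in students:
        changes[rate_band(wpm)].append(after - before)
    return {band: mean(values) for band, values in changes.items()}


# Invented records for illustration only.
records = [
    (150, 2.0, 2.0), (190, 2.2, 2.2),
    (210, 2.0, 2.5), (240, 2.4, 2.9),
    (300, 3.0, 3.0), (350, 3.2, 3.2),
]
by_band = mean_gpa_change_by_band(records)
overall = mean([after - before for _, before, after in records])
```

In this made-up data the overall mean change is small, while the breakdown
shows that all of it comes from the middle band.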

This study also illustrates the complexity of the relationships between
improvement in reading skills and grades and points up the need for carefully
thought out and well designed studies.

In conclusion, it is important for the college reading administrator to
collect data on students' pre- and post-grades, but it is equally important
that these be viewed in terms of the specific objectives of the reading
program. It [...] English course grades were based primarily on the student's
[...] might be a more relevant criterion in the latter case.

Assessing voluntary self-help reading and study skills programs poses
additional [...] a more heterogeneous group of students including some with
honors grades as well as those with low achievement.

Straight-A upper division students may be attracted to the program in hopes
of maintaining their averages with less effort, while less capable students may
need intensive work in basic skills. Attempting to combine such divergent
students into one group and examining mean pre- and post-Grade Point Averages
would have little meaning. However, examining how long high, average and
low-achieving students remain in the program and what they accomplish does have
value in developing insights into the characteristics of students who profit
from the program and in planning ways it could be improved. For example, at the
University of Maryland we found that students with high reading score profiles
tended to remain in a voluntary reading program longer than those with average
or low profiles (Maxwell, 1965). This finding prompted us to reexamine our
program procedures to determine how we could better help the students with poor
skills cope with college demands. Although these students need the service
more than the others, they are also handicapped in finding the time to devote
to skills improvement when heavy course demands take all of their time.

Grade point averages on students are useful to have, but probably individual
course grades, particularly those which have relevance to the objectives of the
reading and study skills program or the specific area on which a student has
worked, seem more important to collect and analyze.



Certainly more carefully controlled studies on the effectiveness of college
reading and study skills programs for the disadvantaged need to be made. These
kinds of studies do raise ethical and political questions, for they require
that an equally deficient and equally motivated group of students be deprived
of the "benefits" of the program and serve as a control group. These questions
militate against using traditional experimental methodology and force us to
look for meaningful but less direct and different ways of evaluating our
programs. However, if clear behavioral objectives are stated at the beginning
of the program, data can be systematically collected on the percentage of the
group that achieves the criterion by the end of the program. (Such objectives
might include the ability to read and answer a general discussion question
about a chapter in history in 30 minutes, or to skim an essay to determine the
author's main premises for his argument in 3 minutes, etc.) To the extent
that these tasks represent "job samples" of the assignments the students are
expected to do in courses, one might legitimately expect that students who
learned these skills would attain higher grades in the specific course. If it
is determined that performance on these tasks is not related to specific grades,
then the reading director should try to determine whether the skills have not
been adequately learned or whether they are inappropriate or irrelevant for the
course in question.
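Tallying the percentage of a group that reaches a behavioral criterion is
straightforward once the criterion is written down as a pass-or-fail rule.
The sketch below assumes one such rule, modeled on the 30-minute history
chapter task; the record layout and the results are hypothetical.

```python
def percent_meeting_criterion(results, passes):
    """Percentage of students whose end-of-program result satisfies the criterion."""
    if not results:
        raise ValueError("no results recorded")
    return 100 * sum(1 for r in results if passes(r)) / len(results)


# Hypothetical criterion: answer a general discussion question on a history
# chapter within 30 minutes, with an answer the instructor rates acceptable.
def history_chapter_criterion(result):
    minutes, acceptable = result
    return minutes <= 30 and acceptable


# One (minutes taken, answer acceptable) pair per student; invented data.
end_of_program = [(25, True), (30, True), (28, False), (41, True), (22, True)]
rate = percent_meeting_criterion(end_of_program, history_chapter_criterion)
```

The same function can be reused for each objective by supplying a different
rule, so that a program reports one percentage per criterion task.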
Students' Attitudes

Ironside (1969) stresses the need for students to be involved in assessing
both the goals and their progress in a reading program. He also mentions the
over-use of single factor tests in evaluating a program that involves many
skills and recommends a grid of 21 skills which can be used to set goals and
can also be used to develop criterion tasks for assessing the most common
reading skills that could be taught in a course. He stresses that feedback to
the student is essential so that objectives can be matched to the instructional
program and progress assessed.

Wood (1961) proposed attrition as a criterion for evaluating non-credit
reading programs, assuming that if students in a voluntary program persisted,
then this would suggest that they were gaining something from it.

Knafle (1960) studied personality characteristics of students enrolled in
a reading and study skills program and instructors' ratings and found that for
poor readers, instructors apparently used different criteria to assess
improvement. Students with higher scores in dominance who were poor readers
were more likely to be assessed by instructors as making greater improvement in
reading than students with better reading skills who were equally dominant. On
the other hand, students with higher reading scores who had high scores on
achievement via independence on the California Psychological Inventory were
assessed as making greater improvement than those with low scores on this
dimension.

Thus there seemed to be interaction effects in terms of teacher expectations
and personality patterns of students persisting in a reading program.

Post-questionnaires assessing students' attitudes toward the program are
frequently used and can provide valuable information about student reactions
and also serve to give the students an opportunity to express their feelings
about a program. Student evaluations, provided they are anonymous, will yield
valuable insights. Such questionnaires are subject to the halo effect and
should be anonymous to net maximum information.

In summary then, there are a number of techniques that can be used to
assess college reading programs. Most important is to clearly define your
objectives and set specific criterion tasks that are consistent with the
objectives of the program. Standardized tests can be used to describe group
changes, but have their limitations if the program's viability is to hinge on
the performance of students at the end of a program. Certainly information
about grades and grade point averages should be collected, since in essence
most of our programs do aim to help students improve in their academic work.
If the program is to be strengthened, then it should be built on the students'
needs, and without objective knowledge about the kinds of students who do
succeed or fail in the program, it is difficult to do long-range planning. If
a reading and study skills program is restricted to low-ability or
low-achieving students, then the problem of stigma being associated with the
service may be a real one. This may affect the students' progress in the
course and their attitudes toward the reading specialists who run it. My
personal conviction is that college reading programs should meet the needs of
all students who want help, and this implies using a variety of techniques,
diagnosing each individual student's needs, and evaluating his progress in
these particular skills.